
    NeRFs: The Search for the Best 3D Representation

    Neural Radiance Fields, or NeRFs, have become the representation of choice for problems in view synthesis or image-based rendering, as well as in many other applications across computer graphics and vision, and beyond. At their core, NeRFs describe a new representation of 3D scenes or 3D geometry. Instead of meshes, disparity maps, multiplane images or even voxel grids, they represent the scene as a continuous volume, with volumetric parameters like view-dependent radiance and volume density obtained by querying a neural network. The NeRF representation has now been widely used, with thousands of papers extending or building on it every year, multiple authors and websites providing overviews and surveys, and numerous industrial applications and startup companies. In this article, we briefly review the NeRF representation and describe the three-decade-long quest to find the best 3D representation for view synthesis and related problems, culminating in the NeRF papers. We then describe new developments in terms of NeRF representations and make some observations and insights regarding the future of 3D representations.
    Comment: Updated based on feedback in-person and via e-mail at SIGGRAPH 2023. In particular, I have added references and discussion of seminal SIGGRAPH image-based rendering papers, and better put the recent Kerbl et al. work in context, with more references.
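    As a concrete illustration of the volumetric representation described above, the sketch below composites a single ray from per-sample radiance and density, the quantities a NeRF network is queried for. It is a minimal sketch of standard volume-rendering quadrature, not the authors' code; the sample values here are random stand-ins for MLP outputs.

```python
import numpy as np

def composite_ray(rgb, sigma, t_vals):
    """Volume-render one ray from per-sample radiance and density.

    rgb    : (N, 3) view-dependent radiance at each sample (what the MLP returns)
    sigma  : (N,)   volume density at each sample (what the MLP returns)
    t_vals : (N,)   distances of the samples along the ray
    """
    # Spacing between adjacent samples; the last interval is treated as effectively infinite.
    deltas = np.append(np.diff(t_vals), 1e10)
    # Opacity contributed by each interval.
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1]))
    weights = alpha * trans
    return (weights[:, None] * rgb).sum(axis=0)  # composited pixel color

# Toy usage with random values standing in for network queries along one ray.
t = np.linspace(2.0, 6.0, 64)
color = composite_ray(np.random.rand(64, 3), np.random.rand(64), t)
```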

    Light Field Blind Motion Deblurring

    We study the problem of deblurring light fields of general 3D scenes captured under 3D camera motion and present both theoretical and practical contributions. By analyzing the motion-blurred light field in the primal and Fourier domains, we develop intuition into the effects of camera motion on the light field, show the advantages of capturing a 4D light field instead of a conventional 2D image for motion deblurring, and derive simple methods of motion deblurring in certain cases. We then present an algorithm to blindly deblur light fields of general scenes without any estimation of scene geometry, and demonstrate that we can recover both the sharp light field and the 3D camera motion path of real and synthetically blurred light fields.
    Comment: To be presented at CVPR 2017
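    The key modeling idea above, that a motion-blurred light field is an integral of the sharp light field over the camera's motion path, can be sketched as a toy forward model. The function and its path_uv argument are hypothetical illustrations, not the paper's algorithm, and only in-plane angular motion is simulated.

```python
import numpy as np
from scipy.ndimage import shift

def blur_along_path(light_field, path_uv):
    """Toy forward model: average re-parameterized copies of a sharp 4D light
    field (u, v, x, y) over a discretized camera path to simulate motion blur.

    path_uv is a hypothetical list of in-plane camera offsets (du, dv); a full
    model would also re-sample the spatial axes for general 3D motion.
    """
    acc = np.zeros_like(light_field, dtype=float)
    for du, dv in path_uv:
        # Shift only the angular axes by this pose's offset and accumulate.
        acc += shift(light_field, (du, dv, 0.0, 0.0), order=1, mode='nearest')
    return acc / len(path_uv)

# Example: blur a random light field along a short linear camera path.
lf = np.random.rand(5, 5, 32, 32)
blurred = blur_along_path(lf, [(0.0, 0.0), (0.5, 0.2), (1.0, 0.4)])
```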

    Image to Image Translation for Domain Adaptation

    We propose a general framework for unsupervised domain adaptation, which allows deep neural networks trained on a source domain to be tested on a different target domain without requiring any training annotations in the target domain. This is achieved by adding extra networks and losses that help regularize the features extracted by the backbone encoder network. To this end, we propose the novel use of the recently proposed unpaired image-to-image translation framework to constrain the features extracted by the encoder network. Specifically, we require that the features extracted are able to reconstruct the images in both domains. In addition, we require that the distributions of features extracted from images in the two domains be indistinguishable. Many recent works can be seen as specific cases of our general framework. We apply our method for domain adaptation between the MNIST, USPS, and SVHN datasets, and the Amazon, Webcam, and DSLR Office datasets in classification tasks, and also between the GTA5 and Cityscapes datasets for a segmentation task. We demonstrate state-of-the-art performance on each of these datasets.
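    A minimal sketch of the loss terms described above, written with PyTorch for concreteness: a supervised task loss on the source domain, reconstruction of images in both domains from the shared features, and an adversarial term pushing the two feature distributions to be indistinguishable. The module names (enc, dec_src, dec_tgt, disc, clf) are hypothetical placeholders, not the paper's architecture.

```python
import torch
import torch.nn.functional as F

def adaptation_losses(enc, dec_src, dec_tgt, disc, clf, x_src, y_src, x_tgt):
    """Sketch of the three loss families: enc is the backbone encoder,
    dec_src/dec_tgt are hypothetical per-domain decoders, disc is a feature
    discriminator, and clf is the task classifier."""
    f_src, f_tgt = enc(x_src), enc(x_tgt)

    # Supervised task loss, available only for labeled source images.
    loss_task = F.cross_entropy(clf(f_src), y_src)

    # Features must be able to reconstruct the images in both domains.
    loss_rec = F.l1_loss(dec_src(f_src), x_src) + F.l1_loss(dec_tgt(f_tgt), x_tgt)

    # Adversarial term: feature distributions should be indistinguishable.
    d_src, d_tgt = disc(f_src), disc(f_tgt)
    loss_adv = F.binary_cross_entropy_with_logits(
        torch.cat([d_src, d_tgt]),
        torch.cat([torch.ones_like(d_src), torch.zeros_like(d_tgt)]),
    )
    return loss_task, loss_rec, loss_adv
```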

    Learning to Synthesize a 4D RGBD Light Field from a Single Image

    We present a machine learning algorithm that takes as input a 2D RGB image and synthesizes a 4D RGBD light field (color and depth of the scene in each ray direction). For training, we introduce the largest public light field dataset, consisting of over 3300 plenoptic camera light fields of scenes containing flowers and plants. Our synthesis pipeline consists of a convolutional neural network (CNN) that estimates scene geometry, a stage that renders a Lambertian light field using that geometry, and a second CNN that predicts occluded rays and non-Lambertian effects. Our algorithm builds on recent view synthesis methods, but is unique in predicting RGBD for each light field ray and in improving unsupervised single-image depth estimation by enforcing consistency of ray depths that should intersect the same scene point. Please see our supplementary video at https://youtu.be/yLCvWoQLnms
    Comment: International Conference on Computer Vision (ICCV) 2017
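    The Lambertian rendering stage in the pipeline above can be sketched as a depth-based warp: each sub-aperture view is synthesized by re-sampling the input image according to the predicted per-pixel disparity and the view's angular offset. This is a simplified, hypothetical illustration; occlusions and non-Lambertian effects, which the paper's second CNN handles, are ignored here.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def render_lambertian_view(center_img, disparity, u, v):
    """Warp the central view to the sub-aperture view at angular offset (u, v)
    using per-pixel disparity predicted by the first CNN (toy version)."""
    h, w = disparity.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    # Each output ray samples the input at a location shifted by disparity * offset.
    src_y = yy + v * disparity
    src_x = xx + u * disparity
    channels = [
        map_coordinates(center_img[..., c], [src_y, src_x], order=1, mode='nearest')
        for c in range(center_img.shape[-1])
    ]
    return np.stack(channels, axis=-1)

# Example: render one corner view of the light field from a random image.
img = np.random.rand(64, 64, 3)
disp = np.full((64, 64), 1.5)
view = render_lambertian_view(img, disp, u=-3, v=-3)
```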

    Creating Generative Models from Range Images

    We describe a new approach for creating concise high-level generative models from one or more approximate range images. Using simple acquisition techniques and a user-defined class of models, our method produces a simple and intuitive object description that is relatively insensitive to noise and is easy to manipulate and edit. The algorithm has two inter-related phases: recognition, which chooses an appropriate model within a given hierarchy, and parameter estimation, which adjusts the model to fit the data. We give a simple method for automatically making tradeoffs between simplicity and accuracy to determine the best model. We also describe general techniques to optimize a specific generative model. In particular, we address the problem of creating a suitable objective function that is sufficiently continuous for use with finite-difference-based optimization techniques. Our technique for model recovery and subsequent manipulation and editing is demonstrated on real objects -- a spoon, bowl, ladle, and cup -- using a simple tree of possible generative models. We believe that higher-level model representations are extremely important, and their recovery for actual objects is a fertile area of research towards which this thesis is a step.
    However, our work is preliminary and there are currently several limitations. The user is required to create a model hierarchy (and supply methods to provide an initial guess for model parameters within this hierarchy); the use of a large pre-defined class of models can help alleviate this problem. Further, we have demonstrated our technique on only a simple tree of generative models. While our approach is fairly general, a real system would require a tree that is significantly larger. Our methods work only where the entire object can be accurately represented as a single generative model; future work could use constructive solid geometry operations on simple generative models to represent more complicated shapes. We believe that many of the above limitations can be addressed in future work, allowing us to easily acquire and process three-dimensional shape in a simple, intuitive and efficient manner.
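    The recognition phase's simplicity-versus-accuracy trade-off can be illustrated with a small scoring loop: each candidate model in the hierarchy is fit to the range data and scored by its residual error plus a complexity penalty. The fit/num_params interface and the penalty weight are hypothetical; the thesis's actual criterion may differ.

```python
def select_model(candidates, range_points, lam=0.1):
    """Pick the generative model that best balances accuracy and simplicity.

    Each candidate is assumed to expose fit(points) -> RMS error (the
    parameter-estimation phase) and a num_params attribute; lam weights the
    complexity penalty. This is an illustrative sketch only.
    """
    best_model, best_score = None, float("inf")
    for model in candidates:
        error = model.fit(range_points)          # parameter estimation
        score = error + lam * model.num_params   # accuracy + simplicity trade-off
        if score < best_score:
            best_model, best_score = model, score
    return best_model
```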

    Dynamic Splines with Constraints for Animation

    In this paper, we present a method for fast interpolation between animation keyframes that allows for automatic computer-generated "improvement" of the motion. Our technique is closely related to conventional animation techniques, and can be used easily in conjunction with them for fast improvements of "rough" animations or for interpolation to allow sparser keyframing. We apply our technique to the construction of splines in quaternion space, where we show 100-fold speed-ups over previous methods. We also discuss our experiences with animation of an articulated human-like figure. Features of the method include: (1) development of new subdivision techniques based on the Euler-Lagrange differential equations for splines in quaternion space; (2) an intuitive and simple set of coefficients to optimize over, which is different from the conventional B-spline coefficients; (3) widespread use of unconstrained minimization as opposed to the constrained optimization needed by many previous methods. This speeds up the algorithm significantly, while still maintaining keyframe constraints accurately.
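    Feature (3) above, replacing constrained optimization with unconstrained minimization while keeping keyframes exact, can be sketched by optimizing only the non-keyframe samples of a discrete spline energy. The snippet is a scalar toy (the paper works with splines in quaternion space) and its energy is a hypothetical stand-in for the Euler-Lagrange-based formulation.

```python
import numpy as np
from scipy.optimize import minimize

def smooth_interpolate(keys, n=32):
    """Interpolate keyframes by minimizing a discrete smoothness energy over
    only the free (non-keyframe) samples, so the keyframe constraints hold
    exactly without constrained optimization. Scalar toy version."""
    free_idx = [i for i in range(n) if i not in keys]

    def assemble(free_vals):
        x = np.zeros(n)
        for i, v in keys.items():
            x[i] = v
        x[free_idx] = free_vals
        return x

    def energy(free_vals):
        accel = np.diff(assemble(free_vals), 2)   # discrete second derivative
        return np.sum(accel ** 2)                 # smoothness ("dynamic") energy

    result = minimize(energy, np.zeros(len(free_idx)))
    return assemble(result.x)

# Example: a curve of 32 samples pinned at keyframes 0, 15, and 31.
curve = smooth_interpolate({0: 0.0, 15: 1.0, 31: 0.0})
```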